Analysis of the Characteristics of Production Database Workloads and Comparison with the TPC Benchmarks
Authors
Abstract
There has been very little empirical analysis of any real production database workloads. Although the Transaction Processing Performance Council benchmarks C (TPC-C) and D (TPC-D) have become the standard benchmarks for online transaction processing and decision support systems, respectively, there has also not been any major effort to systematically analyze their workload characteristics, especially in relation to those of real production database workloads. In this paper, we examine the characteristics of the production database workloads of ten of the world’s largest corporations and compare them to TPC-C and TPC-D. We find that the production workloads exhibit a wide range of behavior; in some cases, the TPC benchmarks fall reasonably within the range of real workload behavior, and in other cases, they are not representative of the real workloads. In general, the two TPC benchmarks complement one another in reflecting the characteristics of the production workloads, but there are still some aspects of the real workloads that are not represented by either benchmark. Specifically, our analysis suggests that the TPC benchmarks tend to exercise the following aspects of the system differently than the production workloads: concurrency control mechanism (TPC-C tends to have longer transactions and fewer read-only transactions than the production workloads, while some of TPC-D’s transactions are much longer but are read-only and are run serially), workload-adaptive techniques (the production workloads have I/O demands that are much more bursty), scheduling and resource allocation policies (unlike TPC-C, which has very regular transactions, and TPC-D, which has long queries that are run serially, the production workloads tend to have many concurrent and diverse transactions), and I/O optimizations for temporary and index files (TPC-C has no I/O activity to temporary objects, while most of TPC-D’s references are directed at index objects).
In this paper, we also reexamine Amdahl’s rule of thumb for a typical data processing system (one bit of I/O for every instruction) and discover that both the TPC benchmarks and the production workloads generate on the order of 0.5 to 1.0 bit of logical I/O per instruction, surprisingly close to the much earlier figure.
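The ratio behind Amdahl's rule of thumb is simple arithmetic: bits of logical I/O divided by instructions executed. The following is a minimal sketch of that calculation, not the paper's measurement method; the function name `io_bits_per_instruction` and the sample figures are hypothetical and chosen only for illustration.

```python
def io_bits_per_instruction(logical_io_bytes: int, instructions: int) -> float:
    """Return bits of logical I/O generated per executed instruction."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    # Convert bytes to bits, then normalize by the instruction count.
    return (logical_io_bytes * 8) / instructions


# Hypothetical measurement (not from the paper):
# 750 MB of logical I/O observed over 8 billion instructions.
ratio = io_bits_per_instruction(750_000_000, 8_000_000_000)
print(f"{ratio:.2f} bits of logical I/O per instruction")
```

A workload that matches Amdahl's original rule exactly would yield a ratio of 1.0; the abstract reports values on the order of 0.5 to 1.0 for both the benchmarks and the production workloads.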
Similar resources
Characteristics of production database workloads and the TPC benchmarks
There has been very little empirical analysis of any real production database workloads. Although the Transaction Processing Performance Council benchmarks C (TPC-C) and D (TPC-D) have become the standard benchmarks for on-line transaction processing and decision support systems, respectively, there has not been any major effort to systematically analyze their workload characteristics, especial...
I/O reference behavior of production database workloads and the TPC benchmarks - an analysis at the logical level
As improvements in processor performance continue to far outpace improvements in storage performance, I/O is increasingly the bottleneck in computer systems, especially in large database systems that manage huge amounts of data. The key to achieving good I/O performance is to thoroughly understand its characteristics. In this paper, we present a comprehensive analysis of the logical I/O referen...
Measuring Database Performance in Online Services: A Trace-Based Approach
Many large-scale online services use structured storage to persist metadata and sometimes data. The structured storage is typically provided by standard database servers such as Microsoft’s SQL Server. It is important to understand the workloads seen by these servers, both for provisioning server hardware as well as to exploit opportunities for energy savings and server consolidation. In this p...
DBmbench: fast and accurate database workload representation on modern microarchitecture
With the proliferation of database workloads on servers, much recent research on server architecture has focused on database system benchmarks. The TPC benchmarks for the two most common server workloads, OLTP and DSS, have been used extensively in the database community to evaluate the database system functionality and performance. Unfortunately, these benchmarks fall short of being effective ...
Stream Processing Systems Have Arrived at the Big Data Party. But Where Are All the Benchmarks?
Stream processing systems have now become an integral part of the Big Data ecosystem. Unfortunately, streaming benchmarks have not followed suit leading to non-representative benchmarking of systems. Benchmarks in general have many use cases including: (a) comparing two or more systems, (b) matching applications and workloads to systems, and (c) configuring and optimizing a system. Due to these...
Publication date: 1999